1. Field of the Invention
The present invention relates to media content and, more specifically, to a system and method for indexing, searching within, and retrieving media content and for presenting a textual analysis of the media content to the user.
2. Introduction
Those who frequently use the Internet are familiar with search engines such as Google® and Yahoo®. Search engines have proven highly useful in taking a text query from a user and searching within web pages to retrieve related information containing the queried text. The nature of web pages allows their text content to be easily searched. While this is valuable, it is also limiting, because it excludes a host of media from being searched. The content of media presentations such as motion pictures, songs, and printed publications is not searchable in its original form.
As the body of motion pictures, songs, books, and other works expands, so does the body of well-known lines and phrases from these works. Lines like, “I am your father,” or “a three-hour tour,” or “It was the best of times, it was the worst of times,” are recognized almost universally as lines from the films, songs, or books that made them famous. They are often quoted and imitated in other media presentations, becoming incorporated into popular idioms and expressions. Many people may remember a movie or a song by such a phrase even when they have forgotten the movie's title or the actors in the movie.
The origins of these phrases and their impact on language and society are topics of scholarly study. Famous lines end up changing the way a culture may communicate. The popularity of these phrases may also be utilized as a marketing tool. Often, the lasting memory of a movie, song, or book is encapsulated by just a few words in the minds of its audience.
Many media presentations are viewed, heard, read, rented, bought and sold worldwide in a great variety of formats. One highly popular format is the digital video disk (DVD). DVDs have become very popular, and in some cases revenues from DVD sales outpace revenues from movie theater ticket sales. Growth and development in several areas, especially the Internet, are bringing an ever wider variety of options, titles, and sources of media content to consumers. Also, a proliferation of illicit sources and media distribution methods poses a challenge to the legal owners of copyrighted works. It is desirable for content providers and copyright owners to allow consumers to search for, discover, and learn about the available media. There are numerous media presentations for which there is no consumer demand simply because consumers are not aware that they exist.
Many consumers may desire to buy a particular movie but cannot remember the title or main actors. The same problem may occur when a consumer tries to find a particular song while remembering only a few words or phrases from the lyrics. Furthermore, those doing media research may desire to draw comparisons between the content of several different presentations. Without searchable access and retrieval capabilities, the difficulty of each of these tasks is increased.
Amazon.com® provides one example of how users may identify media content through a title or author search and then purchase the content. For example, one Amazon.com feature enables the user to search via book title and then view a page and move forward or backward a few pages in the book. This information helps the user determine whether to purchase the book. Selected pages are shown, such as the index, the table of contents, and an excerpt of a few pages from within the book. This approach is limited in that, unless the user knows the title or author, it can be difficult to locate or identify the book. A user may only know a few phrases from the book or movie. The Amazon.com search engine searches only titles and authors; therefore, unless that information is known, searching via media content is unworkable.
Other search engines are similar. For example, the Google® search engine does not include the content of media but will return web pages that contain the search terms. Therefore, a user seeking to identify media that contains certain words or phrases cannot identify the media via a Google search.
Legal issues exist in the realm of searching content. When Google returns a listing of web pages, the search engine only reports several words from the particular web page. When a user “clicks” on that listing, the user's web browser is pointed to the originating web page and thus is sent to the content owner's web page. The use of the few words to describe the web page as a result of a Google search does not implicate copyright infringement. In the context of obtaining searchable media such as movies, songs or printed media, the ability to redirect users to the source of the content for viewing the actual content becomes problematic in terms of copyright protection. Unlike web pages that are freely available, not all copyright owners place songs, books or other printed media or movies on the Internet for free viewing and linking.
Furthermore, video indexing and searching capability may enable users to find desired video content, but users are often more interested in the context, structure and organization of the video content. Most search engines that present users with listings of media content that match a search string do not provide much, if any, additional data about the particular media content.
What is needed in the art is a new method for enabling the searching of media content and specifically searching for words and phrases within media content while maintaining the rights of copyright owners. Furthermore, what is needed in the art is more information about retrieved media content to provide a user with more useful analysis of the media presentations.